Tutorial 12 - NUPACK calls

Multistrand is equipped with Python wrapper functions for many of Nupack's executables, including pfunc, energy, defect, prob, sample, mfe, and pairs. Here we provide usage examples for all provided wrapper functions. See the Nupack User Manual for more information about the arguments to each executable. Many of these examples are adapted from examples provided with Nupack. Note that these functions assume you have Nupack version 3.0.2 or higher. Certain functions, such as sample(), will not work with earlier versions of Nupack.

In [1]:
# In general, the following arguments are required:
#   sequences  - a list of DNA/RNA sequences
#   structure  - a dot-paren structure (required by energy, prob, and defect)
#   energy_gap - the max allowed energy gap from the minimum free energy (required by subopt)
#   samples    - an integer specifying how many samples to draw (required by sample)
#
# The following optional arguments may also be included in all function calls:
#   ordering   - a list of integer indices indicating the ordering of the given strand sequences
#                in the complex, or None if each strand is distinguishable, occurs exactly once,
#                and is ordered as in the sequences argument. Indices may be repeated to indicate
#                identical, indistinguishable strands. This argument is ignored if multi == False.
#                (default: None)
#   material   - a string representing the material parameters to use (e.g. 'rna' or 'dna')
#                The Nupack User Manual has a complete list of possible values.
#                (default: 'rna')
#   multi      - whether or not to allow multistrand structures with identical strands
#                only works with DNA (default: True)
#   pseudo     - whether or not to allow pseudoknotted structures
#                only works with RNA (default: False)
#   dangles    - 'none', 'all', or 'some' to specify what types of dangle corrections to use (default: 'some')
#   T          - temperature of the system in degrees C (default: 37)
#   sodium     - concentration of Na+ in molar (default: 1.0)
#   magnesium  - concentration of Mg++ in molar (default: 0.0)
#
# The following optional arguments are allowed by only some functions:
#   cutoff     - (for use with pairs) probabilities less than cutoff are excluded (default: 0.001)
#   degenerate - (for use with mfe) degenerate structures are all returned (default: False)
#   mfe        - (for use with defect) returns mfe defect rather than ensemble defect (default: False)

Import the NUPACK wrapper:

In [2]:
from nupack import *
Python interface to NUPACK 3.0 (Pierce lab, Caltech, www.nupack.org) loaded.
In [3]:
## Sequences used throughout this file:
rna_seq = ['GGGCUGUUUUUCUCGCUGACUUUCAGCCCCAAACAAAAAAUGUCAGCA'];
dna_seqs = ['AGTCTAGGATTCGGCGTGGGTTAA',
            'TTAACCCACGCCGAATCCTAGACTCAAAGTAGTCTAGGATTCGGCGTG',
            'AGTCTAGGATTCGGCGTGGGTTAACACGCCGAATCCTAGACTACTTTG']
dna_struct = '((((((((((((((((((((((((+))))))))))))))))))))))))((((((((((((.(((((((((((+........................))))))))))).))))))))))))'

Examples with pfunc()

Find the partition function of this set of three DNA strands.

In [4]:
pfunc(dna_seqs, material = 'dna')
Out[4]:
-62.66453454292297

Find the partition function of this single RNA strand, including pseudoknotted structures.

In [5]:
pfunc(rna_seq, material = 'rna', multi = False, pseudo = True)
Out[5]:
-17.2786076

Examples with energy()

Find the energy (in kcal/mol) of this DNA structure.

In [6]:
energy(dna_seqs, dna_struct, material = 'dna')
Out[6]:
-61.79270283815892

Examples with defect()

Find the ensemble defect of this DNA structure.

In [7]:
defect(dna_seqs, dna_struct, material = 'dna')
Out[7]:
8.297

Find the MFE defect of this DNA structure.

In [8]:
defect(dna_seqs, dna_struct, material = 'dna', mfe = True)
Out[8]:
2.0

Examples with prob()

Find the probability of this DNA structure.

In [9]:
prob(dna_seqs, dna_struct, material = 'dna')
Out[9]:
7.992e-05

Examples with sample()

Boltzmann sample 3 structures with these DNA strands.

In [10]:
sample(dna_seqs, 3, material = 'dna')
Out[10]:
['.(((((((((((((((((((((..+..)))))))))))))))))))))..(((((((((((((((((((((((+........................))))))))))))))))))))))).',
 '((((((((((((((((((((((..+..)))))))))))))))))))))).(((((((((((((((((((((((+..(.....)...............))))))))))))))))))))))).',
 '((((((((((((((((((((((((+))))))))))))))))))))))))(((((((((((((((((((((((.+......(.......)..........)))))))))))))))))))))))']

Examples with mfe()

Find the MFE structure of this ordering of DNA sequences (the energy is also provided).

In [11]:
mfe(dna_seqs, material = 'dna')
Out[11]:
[('((((((((((((((((((((((((+))))))))))))))))))))))))((((((((((((((((((((((((+........................))))))))))))))))))))))))',
  '-65.913')]

Examples with pairs()

Find the pair probabilities of the following unpseudoknotted DNA strands. The output is a list of tuples of the form (i, j, p) where p is the probability that the ith and jth bases are bound.

In [12]:
pairs(['ACGT','GCTT'], material = 'dna')
Out[12]:
[(1, 7, 0.031271),
 (1, 8, 0.049495),
 (2, 5, 0.2163),
 (3, 6, 0.73677),
 (1, 9, 0.91923),
 (2, 9, 0.7837),
 (3, 9, 0.26323),
 (4, 9, 1.0),
 (5, 9, 0.7837),
 (6, 9, 0.26323),
 (7, 9, 0.96873),
 (8, 9, 0.95051)]
In [ ]: